opt: five IR and LLVM optimisations for bfc #5

Merged: benmandrew merged 8 commits into main from feature/optimisations on Jun 30, 2026
@benmandrew

Owner

Summary

  • Cancel opposing commands: adjacent +/- and >/< pairs are merged or cancelled in the IR phase (e.g. +++-- reduces to a single +).
  • CMD_CLEAR: the [-]/[+] zero-cell idiom is replaced with a single store i8 0 instead of a full loop.
  • CMD_MULTIPLY: multiply-add loops (e.g. [->+<]) are detected and replaced with counter load + mul/add + zero, eliminating the loop entirely. Supports up to 8 target cells within ±64 offset.
  • dp as alloca: the data-pointer is moved from a global to an alloca in main, enabling LLVM's mem2reg pass to promote it to a register.
  • LLVM pass pipeline (-O/--optimise): wires the already-parsed flag through generate() and runs mem2reg,instcombine,simplifycfg,gvn via LLVMRunPasses. Under -O, helloworld.b produces completely loop-free IR with constant-folded GEPs and no dp loads.

IR-level passes (1–3) always run, and the dp alloca change (4) is unconditional; LLVM passes (5) are gated on -O. All existing tests pass, and a new test/test_multiply.filecheck is added.

Test plan

  • cmake --build build --target tests passes (all FileCheck, unit, and expect tests)
  • build/bfc test/res/helloworld.b produces valid IR that compiles and runs correctly
  • build/bfc -O test/res/helloworld.b produces noticeably simpler IR (no %dp loads, no loop blocks)
  • build/bfi test/res/factorial.b still produces correct output

Adds optimise_program() called from both bfc and bfi after parsing.
The first pass (cancel_opposing) merges adjacent INC/DEC and RIGHT/LEFT
pairs, subtracting their counts and removing pairs that fully cancel.
Bracket jump indices are remapped after compaction.
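
The cancellation step described above can be sketched as a single in-place, stack-style compaction. This is a minimal illustration, not bfc's actual code: the `op` struct, the `OP_*` names, and `cancel_opposing_sketch` are all hypothetical, same-kind merging is an added assumption, and the bracket jump remapping mentioned above is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IR node; bfc's real types are not shown in this PR. */
typedef enum { OP_INC, OP_DEC, OP_RIGHT, OP_LEFT, OP_OTHER } op_kind;

typedef struct {
    op_kind kind;
    size_t count; /* repeat count, > 0 for the four counted kinds */
} op;

/* Returns the kind that undoes `k`, or OP_OTHER if it has no opposite. */
static op_kind opposite(op_kind k) {
    switch (k) {
    case OP_INC:   return OP_DEC;
    case OP_DEC:   return OP_INC;
    case OP_RIGHT: return OP_LEFT;
    case OP_LEFT:  return OP_RIGHT;
    default:       return OP_OTHER;
    }
}

/* Compacts ops[0..n) in place and returns the new length.
 * The output prefix acts as a stack, so a pair that fully cancels
 * exposes the previous op to the next one (e.g. "+><-" becomes empty). */
size_t cancel_opposing_sketch(op *ops, size_t n) {
    assert(ops != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        op cur = ops[i];
        if (out > 0 && cur.kind != OP_OTHER) {
            op *top = &ops[out - 1];
            if (top->kind == cur.kind) {          /* assumption: merge runs */
                top->count += cur.count;
                continue;
            }
            if (top->kind == opposite(cur.kind)) {
                if (top->count > cur.count) {
                    top->count -= cur.count;      /* earlier op wins */
                } else if (top->count < cur.count) {
                    top->kind = cur.kind;         /* later op wins */
                    top->count = cur.count - top->count;
                } else {
                    out--;                        /* pair fully cancels */
                }
                continue;
            }
        }
        ops[out++] = cur;
    }
    return out;
}
```

Under these assumptions, the sequence INC×3, DEC×2 compacts to a single INC×1, matching the +++-- example in the summary.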

Adds detect_clear_loops() pass: a loop whose body is a single INC or
DEC (any count) is replaced with the synthetic CMD_CLEAR node, emitting
a single store i8 0 in LLVM IR and a direct zero-assignment in the
interpreter. Updates test_simple_loop.filecheck to match.
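
As a rough sketch of the pattern match described above, the following rewrites every open / single INC-or-DEC / close triple into one clear node. The `node` struct, the `K_*` names, and `detect_clear_loops_sketch` are hypothetical stand-ins for bfc's real IR, and jump indices are again ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IR node kinds; K_CLEAR plays the role of CMD_CLEAR. */
typedef enum {
    K_INC, K_DEC, K_RIGHT, K_LEFT, K_OPEN, K_CLOSE, K_CLEAR
} node_kind;

typedef struct {
    node_kind k;
    size_t count;
} node;

/* Rewrites p[0..n) in place and returns the new length.
 * A loop whose whole body is one INC or DEC node (any count)
 * collapses to a single K_CLEAR node. */
size_t detect_clear_loops_sketch(node *p, size_t n) {
    assert(p != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        int is_clear = i + 2 < n &&
                       p[i].k == K_OPEN &&
                       (p[i + 1].k == K_INC || p[i + 1].k == K_DEC) &&
                       p[i + 2].k == K_CLOSE;
        if (is_clear) {
            p[out].k = K_CLEAR;
            p[out].count = 1;
            out++;
            i += 2; /* skip the body and the closing bracket */
        } else {
            p[out++] = p[i];
        }
    }
    return out;
}
```

In a code generator, the clear node would then lower to the single store i8 0 mentioned above rather than a loop header, body, and back-edge.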

Adds detect_multiply_loops() pass: a loop whose body contains only
+/-/</> with net pointer delta 0 and loop-counter delta -1 is
replaced with CMD_MULTIPLY. Each non-counter cell touched becomes an
{offset, factor} pair. Codegen emits counter load, multiply-adds, then
store i8 0. Supports up to MULTIPLY_MOVES_MAX (8) target cells, offsets
within ±64. Adds test/res/multiply.b and test/test_multiply.filecheck.
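
The detection conditions above can be illustrated with a small analyser that works on the raw loop body text, plus a helper that applies the resulting multiply-adds to a tape. This is only a sketch under stated assumptions: `move`, `analyse_multiply_body`, and `apply_multiply` are hypothetical names, the real pass operates on IR nodes rather than characters, and cells whose net change is zero are assumed to be dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOVES_MAX 8   /* mirrors MULTIPLY_MOVES_MAX (8) in the description */
#define OFFSET_MAX 64 /* targets must lie within +/-64 of the counter */

typedef struct {
    int offset; /* target cell, relative to the loop counter */
    int factor; /* net change to that cell per loop iteration */
} move;

/* Analyses a loop body (the text between '[' and ']').
 * Returns the number of moves written, or -1 if the body is not a
 * multiply loop under the rules described above. */
int analyse_multiply_body(const char *body, move *moves) {
    assert(body != NULL && moves != NULL);
    int delta[2 * OFFSET_MAX + 1] = {0};
    int pos = 0;
    for (const char *c = body; *c != '\0'; c++) {
        switch (*c) {
        case '>': pos++; break;
        case '<': pos--; break;
        case '+':
        case '-':
            if (pos < -OFFSET_MAX || pos > OFFSET_MAX) return -1;
            delta[pos + OFFSET_MAX] += (*c == '+') ? 1 : -1;
            break;
        default:
            return -1; /* I/O or nested loops disqualify the body */
        }
    }
    /* Pointer must return home; counter must drop by exactly one. */
    if (pos != 0 || delta[OFFSET_MAX] != -1) return -1;

    int n = 0;
    for (int off = -OFFSET_MAX; off <= OFFSET_MAX; off++) {
        int f = delta[off + OFFSET_MAX];
        if (off == 0 || f == 0) continue;
        if (n == MOVES_MAX) return -1;
        moves[n].offset = off;
        moves[n].factor = f;
        n++;
    }
    return n;
}

/* Applies the moves: counter load, multiply-adds, then zero the counter.
 * The caller must ensure every dp + offset is inside the tape. */
void apply_multiply(uint8_t *tape, size_t dp, const move *moves, int n) {
    uint8_t counter = tape[dp];
    for (int i = 0; i < n; i++) {
        size_t idx = (size_t)((ptrdiff_t)dp + moves[i].offset);
        tape[idx] = (uint8_t)(tape[idx] + counter * moves[i].factor);
    }
    tape[dp] = 0;
}
```

With this sketch, the body of [->+<] yields one move {offset 1, factor 1}; moves are emitted in ascending offset order, and arithmetic wraps modulo 256 as 8-bit cells do.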

Removes the @dp global variable and replaces it with an alloca in the
main function entry block. The LLVMValueRef ctx->dp is still a pointer
so all load/store callsites are unchanged. With LLVM's mem2reg pass
(applied under -O) the alloca is promoted to an SSA register, removing
all dp memory traffic. Updates all FileCheck tests accordingly.

Wires the already-parsed --optimise flag from main_bfc.c through
generate(program, optimise). When true, runs mem2reg,instcombine,
simplifycfg,gvn via LLVMRunPasses (new pass manager, LLVM >= 14).
mem2reg promotes the dp alloca to SSA registers; gvn eliminates
redundant loads; instcombine and simplifycfg clean up the result.
Adds the passes component to llvm_map_components_to_libnames.
FileCheck tests are unaffected as they do not pass -O.

Replace (unsigned long long) casts with (uint64_t) to satisfy cpplint
runtime/int rule. Re-run clang-format to fix indentation in multiply(),
detect_multiply_loops(), and the CMD_MULTIPLY interp case.

The clang static analyzer in CI flags SIZE_MAX (ir.c) and uint8_t
(interp.c) as undeclared without an explicit stdint.h include, so both
files now include it directly.

@benmandrew merged commit 38c7265 into main on Jun 30, 2026
7 checks passed
@benmandrew deleted the feature/optimisations branch on June 30, 2026 15:31